Mixed-lingual spoken word recognition by using VQ codebook sequences of variable length segments
Authors
Abstract
We are investigating unsupervised phone modeling. This paper describes a method for deriving VQ codebook sequences of variable-length segments from spoken word samples, and reports evaluation results from applying the method to mixed-lingual speech recognition tasks that include non-native speakers. The VQ codebook is generated by a piecewise linear segmentation method comprising segmentation, alignment, reduction, and clustering processes. The derived codebook sequences are evaluated by speaker-independent recognition of a word set that mixes English and Japanese words. Speech samples were uttered by both native English and native Japanese speakers. With a codebook consisting of 128 codes, the average recognition rates for the mixed-lingual set of 618 words are 89.7% for native English speakers and 79.4% for native Japanese speakers.
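To make the idea of a "VQ codebook sequence of variable-length segments" concrete, here is a minimal Python sketch. It is not the authors' procedure: the greedy segmentation rule, the tolerance `tol`, the use of the segment mean as its representative vector, and the function names are all assumptions made for illustration, and the codebook is assumed to have been trained beforehand by some clustering step. The alignment and reduction processes mentioned in the abstract are not modeled.

```python
import numpy as np


def piecewise_linear_segments(frames, tol):
    """Greedy split (illustrative assumption, not the paper's algorithm):
    grow a segment while every frame stays within `tol` of the straight
    line joining the segment's first and last frames.
    Returns a list of (start, end) pairs with `end` exclusive."""
    n = len(frames)
    segments, start = [], 0
    while start < n:
        end = start + 1  # a segment always holds at least one frame
        while end < n:
            # Candidate segment covers frames[start..end] inclusive.
            t = np.linspace(0.0, 1.0, end - start + 1)[:, None]
            line = (1 - t) * frames[start] + t * frames[end]
            err = np.linalg.norm(frames[start:end + 1] - line, axis=1).max()
            if err > tol:
                break
            end += 1
        segments.append((start, end))
        start = end
    return segments


def encode_word(frames, codebook, tol):
    """Map one utterance to a sequence of codebook indices, one index per
    variable-length segment (segment mean -> nearest codeword)."""
    codes = []
    for start, end in piecewise_linear_segments(frames, tol):
        centroid = frames[start:end].mean(axis=0)
        dists = np.linalg.norm(codebook - centroid, axis=1)
        codes.append(int(dists.argmin()))
    return codes


# Toy data: six 2-D feature frames with one abrupt spectral change.
frames = np.array([[0.0, 0.0]] * 3 + [[5.0, 5.0]] * 3)
codebook = np.array([[0.0, 0.0], [5.0, 5.0]])  # hypothetical 2-code codebook
print(encode_word(frames, codebook, tol=0.5))  # [0, 1]
```

In this sketch each word is reduced to a short sequence of code indices whose length depends on how many segments the utterance splits into, which is the sense in which the segments are of variable length.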
Similar resources
Extracting phonological chunks based on piecewise linear segment lattices
The task of our research is to form phone-like models and a phoneme-like set from spoken word samples without using any transcriptions except for the lexical identification of each word in a vocabulary. This framework is derived from two motivations: 1) automatic design of optimal speech recognition units and structures of phone models, and 2) multi-lingual speech recognition based on languagein...
A Vector Quantization Approach to Speaker Recognition
© 1985 IEEE. In this study a vector quantization (VQ) codebook was used as an efficient means of characterizing the short-time spectral features of a speaker. A set of such codebooks were then used to ... system. In the other, Shore and Burton [12] used word-based VQ codebooks and reported good performance in speaker-trained isolated-word recognition experime...
YLAB@RU at Spoken Term Detection Task in NTCIR-9
Information retrieval based on speech recognition is an important technique for easy access to large amounts of multimedia content that includes speech. Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are being widely developed. This paper proposes a new STD method based on vector quantization (VQ). Spoken documents are repre...
Variable dimension VQ encoding and codebook design
A variable dimension vector quantizer (VDVQ) has codewords of unequal dimensions. Here, a trellis-based sequential optimal VDVQ encoding algorithm is proposed. Also, a VDVQ codebook design algorithm based on splitting a node with equal or reduced dimensions is proposed that does not require any codebook parameter to be prespecified unlike known schemes. The VDVQ system is shown to outperform a ...
VQ-faces - unsupervised face recognition from image sequences
In this paper we propose a new method for unsupervised face recognition – VQ-faces, which operates on a sequential stream of face images and is able to handle both frontal and side-view faces at the same time. The method consists of two parts: in the first part, the VQ-faces are calculated as prototype vectors of local areas in image-space, coding for different face-views (i.e. a “view codebook...